A More Relaxed Model for Graph-Based Data Clustering: s-Plex Cluster Editing
Authors
Abstract
We introduce the s-Plex Cluster Editing problem as a generalization of the well-studied Cluster Editing problem; both problems are NP-hard and both are motivated by graph-based data clustering. Instead of transforming a given graph by a minimum number of edge modifications into a disjoint union of cliques (this is Cluster Editing), the task in s-Plex Cluster Editing is to transform a graph into a cluster graph consisting of a disjoint union of so-called s-plexes. Herein, an s-plex is a vertex set S inducing a subgraph in which every vertex has degree at least |S| − s. Cliques are 1-plexes. The advantage of s-plexes for s ≥ 2 is that they model a more relaxed cluster notion (s-plexes instead of cliques), better reflecting inaccuracies of the input data. We develop a provably effective preprocessing based on data reduction (yielding a so-called problem kernel), a forbidden subgraph characterization of s-plex cluster graphs, and a depth-bounded search tree that is used to find optimal edge modification sets. Altogether, this yields efficient algorithms when the number of edge modifications is moderate, which is often a reasonable assumption under a maximum parsimony model for data clustering.
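The s-plex definition in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: it only checks the definition, and it assumes that a graph counts as an s-plex cluster graph when every connected component is an s-plex. The adjacency-dictionary representation and the function names are hypothetical choices made for this example.

```python
from collections import deque


def is_s_plex(adj, vertices, s):
    """Check the definition: every vertex of S has degree >= |S| - s
    in the subgraph induced by S. `adj` maps a vertex to its neighbor set."""
    vs = set(vertices)
    return all(len(adj[v] & vs) >= len(vs) - s for v in vs)


def connected_components(adj):
    """Return the connected components of the graph as a list of vertex sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


def is_s_plex_cluster_graph(adj, s):
    """Assumption for this sketch: an s-plex cluster graph is a graph
    whose connected components are all s-plexes."""
    return all(is_s_plex(adj, comp, s) for comp in connected_components(adj))


# A path a - b - c is not a clique (1-plex), but it is a 2-plex.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
# A triangle is a clique, hence a 1-plex.
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
# A path on four vertices is not a 2-plex: its endpoints have degree 1 < 4 - 2.
path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

With s = 1 the check reduces to testing whether every component is a clique, which matches the statement that cliques are 1-plexes.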
Similar resources
A More Relaxed Model for Graph-Based Data Clustering: s-Plex Editing
We introduce the s-Plex Editing problem generalizing the well-studied Cluster Editing problem, both being NP-hard and both being motivated by graph-based data clustering. Instead of transforming a given graph by a minimum number of edge modifications into a disjoint union of cliques (Cluster Editing), the task in the case of s-Plex Editing is now to transform a graph into a disjoint union of so...
Parameterized Algorithmics for Network Analysis: Clustering & Querying
Preface This thesis summarizes some of my results on NP-hard graph problems that have applications in the areas of network clustering and querying. The research for obtaining these results was supported by the Deutsche Forschungsgemeinschaft (DFG), as a researcher in the DFG project "Parameterized Algorithmics for Bioinformatics" (PABI, NI 369/7). I want to express my gratitude to Rolf Niedermeier for giving me the op...
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with huge volumes of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but we do not have any choice to select the number of clusters and the number of members ...
Kernelization Through Tidying---A Case Study Based on s-Plex Cluster Vertex Deletion
We introduce the NP-hard graph-based data clustering problem s-Plex Cluster Vertex Deletion, where the task is to delete at most k vertices from a graph so that the connected components of the resulting graph are s-plexes. In an s-plex, every vertex has an edge to all but at most s − 1 other vertices; cliques are 1-plexes. We propose a new method for kernelizing a large class of vertex deletion...
Journal title:
- SIAM J. Discrete Math.
Volume 24, Issue
Pages -
Publication date 2010